Simbed: Similarity-Based Embedding
Authors
Abstract
Simbed, standing for similarity-based embedding, is a new method of embedding high-dimensional data. It relies on the preservation of pairwise similarities rather than distances. In this respect, Simbed can be related to other techniques such as stochastic neighbor embedding and its variants. A connection with curvilinear component analysis is also pointed out. Simbed differs from these methods by the way similarities are defined and compared in both the data and embedding spaces. In particular, similarities in Simbed can account for the phenomenon of norm concentration that occurs in high-dimensional spaces. This feature is shown to reinforce the advantage of Simbed over other embedding techniques in experiments with a face database.
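The norm concentration mentioned in the abstract can be illustrated numerically. The following is a minimal sketch and not part of Simbed itself: it assumes i.i.d. standard Gaussian data, and the helper name `relative_spread` is hypothetical. It measures the spread of pairwise Euclidean distances relative to their mean, a ratio that shrinks as the dimension grows.

```python
import numpy as np


def relative_spread(n_points, dim, seed=0):
    """Return std/mean of all pairwise Euclidean distances.

    Assumption for illustration only: points are drawn i.i.d. from a
    standard Gaussian in `dim` dimensions.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_points, dim))

    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    sq_norms = np.sum(x ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)

    # Keep each unordered pair once (strict upper triangle) and clip
    # tiny negative values caused by floating-point round-off.
    rows, cols = np.triu_indices(n_points, k=1)
    d = np.sqrt(np.maximum(d2[rows, cols], 0.0))

    return d.std() / d.mean()


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        print(dim, relative_spread(200, dim))
```

When the ratio is small, all pairwise distances are nearly equal, so a similarity function with a fixed scale has little contrast to work with. That is the situation a concentration-aware similarity definition is meant to handle.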
Similar resources
Link Prediction using Network Embedding based on Global Similarity
Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. Local link prediction approaches with node structure objectives are fast but not accurate enough. On the other hand, the global link predicti...
Difficulty of Processing Japanese and Korean Center-embedding Constructions
This research investigates the effects of syntactic, semantic, and morphophonemic similarity in the processing of center-embedding constructions in Japanese and Korean. Six Japanese experiments study the effects of syntactic and semantic similarity, while one Korean experiment deals with the effects of syntactic and morphophonemic similarity. The results from these experiments support the view ...
Uncertainty in Neural Network Word Embedding: Exploration of Threshold for Similarity
Word embedding, especially with its recent developments, promises a quantification of the similarity between terms. However, it is not clear to what extent this similarity value can be genuinely meaningful and useful for subsequent tasks. We explore whether the similarity score obtained from the models is really indicative of term relatedness. We first observe and quantify the uncertainty factor of...
Model Based Method for Determining the Minimum Embedding Dimension from Solar Activity Chaotic Time Series
Predicting the future behavior of chaotic time series systems is a challenging area in the literature of nonlinear systems. The prediction accuracy for chaotic time series depends heavily on the model and the learning algorithm. On the other hand, cyclic solar activity, as one of the natural chaotic systems, has significant effects on earth, climate, satellites and space missions. Several m...
Heterogeneous Information Network Embedding for Meta Path based Proximity
A network embedding is a representation of a large graph in a low-dimensional space, where vertices are modeled as vectors. The objective of a good embedding is to preserve the proximity (i.e., similarity) between vertices in the original graph. This way, typical search and mining methods (e.g., similarity search, kNN retrieval, classification, clustering) can be applied in the embedded space wi...